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We define an individual-based probabilistic model of a sole (Solea solea) behaviour. The individual 
model is given in terms of an Extended Probabilistic Discrete Timed Automaton (EPDTA), a new 
formalism that is introduced in the paper and that is shown to be interpretable as a Markov decision 
process. A given EPDTA model can be probabilistically model-checked by giving a suitable transla- 
tion into syntax accepted by existing model-checkers. In order to simulate the dynamics of a given 
population of soles in different environmental scenarios, an agent-based simulation environment is 
defined in which each agent implements the behaviour of the given EPDTA model. By varying the 
probabilities and the characteristic functions embedded in the EPDTA model it is possible to repre- 
sent different scenarios and to tune the model itself by comparing the results of the simulations with 
real data about the sole stock in the North Adriatic sea, available from the recent project SoleMon. 
The simulator is presented and made available for its adaptation to other species. 

1 Introduction 

Ecosystems are composed of living animals, plants and non-living structures that exist together and 
interact with each other. Fish are part of the marine ecosystem and interact closely with their physical, 
chemical and biological environment. They are inter-dependent with the ecosystem that provides the 
right conditions for their growth, reproduction and survival. Conversely, they are a source of food for 
other animals and form an integral part of the marine food web. 

The fishing activity impacts both on the fish stocks and on the ecosystem within which they live. 
The Ecosystem Approach to Fisheries (EAF) JTTl recognises that fisheries have to be managed as part 
of their ecosystem and that the impact on the environment should be limited as much as possible. Part 
of this approach is the fish stock assessment. A "stock" is a population of a species living in a defined 
geographical area with similar biological parameters (e.g. growth, size at maturity, fecundity etc.) and 
a shared mortality rate. Its aim is to provide information to managers on the state and life history of 
the stocks. This information is used into the decision making process. Stock assessment can be made 
using mathematical and statistical models to examine the history of the stock and to make quantitative 
predictions in order to address the following questions: 1) What is the current state of the stock? 2) 
What has happened to the stock in the past? 3) What will happen to the stock in the future if different 
management choices are made? Fisheries employs a wide variety of recognised assessment models and 
statistical methods to assess the stocks of fish. If we know about the stock size (biomass) and the biology 
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of the fish stock, we can estimate how many fish can be safely removed from the stock in order to ensure 
a sustainable resource. 

Using the data from a recent research project ifTSl [T6l on the common sole (Solea solea), it was 
possible to obtain new information on the biology of this fish. These data are the input of mathematical 
models based on equations determining the stock assessment of this species. This makes possible to 
regulate the fishing effort in order to avoid overfishing. In this work we want to introduce a somewhat new 
way of addressing fish dynamic population modelling and fish stock assessment. The main characteristic 
of our approach is that it is individual-based, that is to say, every single individual of the population under 
study is considered as an independent entity and the dynamics of the overall population living in a given 
environment emerges from the individual interactions and behaviours. Every aspect of the population 
can then be observed and measured in a simulation environment. This last aspect permits the tuning and 
the validation of the individual model by using existent experimental data. 

Since systems biology was proposed as a challenge of a new way of understanding biology, it has 
involved biologists, physicians, mathematicians, physicists, computer scientists and engineers. In par- 
ticular, in the computer science community a lot of models, languages, approaches and methodologies 
have been applied in a biological context, and several formalisms have been specifically developed for 
describing different aspects of biological systems. In n, authors extend P systems with features typical 
of timed automata with the aim of describing periodic environmental events such as seasons or periodical 
hunts/harvests. In [ 13] it is proposed a modelling framework based on P systems and it is applied to the 
modelling of the dynamics of some scavenger birds in the Pyrenees. This model considers information 
about the feeding of the population. In [6 |, a spatial extension of P systems is introduced and an example 
of the evolution of ring species, based on small changes between geographically contiguous populations, 
is modelled. Authors of |25 1 present a process algebraic approach to the modelling of population dynam- 
ics. Currently no time characterisation can be provided of the modelled biological environment because 
the calculus has not a notion of time. Stamatopoulou et al. provide, in [28] and |[29l . models based 
on X-machines and P systems for biological-inspired systems such as colony of ants or bees, flocks of 
birds and so on. Besozzi et al. [10] model metapopulations (which are ecological models describing 
the interactions and the behaviour of populations living in fragmented habitats) by means of dynamical 
probabilistic P systems where additional structural features have been defined (e.g., a weighted graph as- 
sociated with the membrane structure and the reduction of maximal parallelism). Such a work effectively 
uses many regions to model an ecological system, thus it really exploits the advantages of the membrane 
structure. In [18] authors model the behaviour of a bee colony as a society of communicating agents act- 
ing in parallel and synchronising their behaviour. Two models are provided: one is based on P systems 
while the other is based on X-machines but no tool, thus no actual results, are available to compare the 
behaviour of the two models. Finally, in an older work of Bahr and Bekoff [5] authors model a flock 
in terms of cellular automata; although interesting, theirs work concentrates only on the vigilance of the 
flock and how it is affected by internal and external factors (such as flock size, number of obstacles and 
so on). 

The individual-based vision is quite natural in the computer science world, since notions such as 
process, component, activity, flow, interaction all can be easily related to an executor or a virtual entity. 
When adapting these notions to a biological scenario it is natural to reason in terms of entities that 
"do something" and probably collaborate to make the whole system function well. On the other side, 
biologists have often a view that abstracts from single entities preferring to reason in terms of aggregated 
variables, for which a continuous domain can be adopted and that are related by differential equations 
(ODE, PDE). Consequently, the available data from observations and experiments follow this way of 
thinking and are not directly interpretable in an individual-based setting. To bridge the gap there is the 



F.Buti et al. 



39 




Figure 1 : A simple EPDTA. 

need to develop methodologies and software systems that make the two worlds interact and somehow 
work in synergy to transport the advantages of each view into the other and vice versa. 

In this paper we define a formalism called Extended Probabilistic Discrete Timed Automata (EPDTA) 
that is a variant of probabilistic timed automata [23]. It simplifies the time domain, that is discrete, but 
introduces integer and boolean variables in the state. The formalism is shown to be interpretable as a 
Markov decision process and also easily translatable to a syntax that is accepted by the probabilistic 
model-checker PRISM ll20l l22l . A model of the behaviour of the common sole (Solea solea) living 
in the North Adriatic sea is then given in terms of an EPDTA by using available data from a recent 
project ||T5l[T6l . After the individual behaviour is defined we introduce a simulation environment that 
is agent-based and derives from the one developed in ll26l . Essentially, it creates a Multi-Agent System 
(MAS) in which every agent represents a sole whose internal (probabilistic) behaviour is given by the 
individual model. The MAS permits a precise monitoring of all the events occurring in the virtual square 
kilometre of sea that is simulated. The simulator, called FIShPASs (Fishing Stock Probabilistic Agent- 
based simulator), is available ||21 and easily adaptable to simulate other species. 

The paper is organised as follows: Section [2]defines EPDTAs and gives their semantics as a Markov 
decision process. Section [3] shows a particular EPDTA representing the individual behaviour of a sole 
living in the North Adriatic sea. Section |4] introduces the simulator as a MAS in which each agent 
implements the individual probabilistic behaviour described in Section [3] and shows the preliminary 
simulation results that have to be tuned/validated using the real observation data available form the 
SoleMon project. Section|5] concludes, describing some future work. 

2 Extended Probabilistic Discrete Timed Automata 

In this section we introduce EPDTAs, a variant of probabilistic timed automata that we need for our 
purpose. We then show that an automaton of this kind can be interpreted as a Markov decision process 
and that can be translated easily into one handled by the model checker PRISM [20, 22). 

Briefly, a timed automaton (TA) ||4l is an automaton equipped with real-valued clock variables and 
such that transitions are guarded by clock constraints. The control flows from a state to another instanta- 
neously if and only if the guard of that transition is enabled. Each transition has an action associated and 
can reset some clocks. While the control stays in a state, time elapse i.e. clock values increase. Possible 
conditions in states, called invariants, can prevent the passage of time forcing the control to exit from 
that state with one of the enabled transitions. 

Following the approach of ifTTl . probability has been introduced into timed automata yielding prob- 
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abilistic timed automata (PTA) |l8l|23l- In this case every transition from a state lias a clock constraint as 
a guard, but then the action, reset and destination state is given by a finite probability distribution. Thus, 
every step of a PTA consists in a resolution of non-determinism among different enabled transitions, pas- 
sage of time included, followed by a probabilistic choice of the action, reset and destination according to 
the given distribution. 

An EPDTA is essentially a PTA with the set N of natural numbers as time domain and in which 
the locations are enriched with a finite set of boolean and finite-range integer variables. Clocks, that 
are considered similar to integer variables with range [0,°°], grow at discrete steps of length 1. We also 
add a subset of actions that are called urgent and that must be executed as soon as they are enabled. 
The motivation of these variants are essentially given by the peculiarities of the models of individuals in 
ecosystems: their state-changes are typically modelled in terms of transitions that happen at a time scale 
of years, months and, in the finer grain, weeks. Thus, there is no need of continuous time. Moreover, 
each individual has some characteristics, e.g. age, sex, length, weight, fertility, last time of reproduction, 
etc. that are easily representable by integer (with finite range) and boolean variables and that influence 
its behaviour. Last characteristic of this meta-model is a constant MAX_TIME e N that represents the 
maximum number of time steps that the automaton can perform. This means that each clock has, actually, 
a range [0, MAX_TIME] . Since this constant can be chosen arbitrarily large, this requirement does not limit 
the generality of the meta-model both if it is used in a simulation environment and if it is used for model 
checking. On the other hand it permits to give a finite range to all variables of the model, clocks included. 

Now we define in detail the syntax and the semantics of EPDTAs. This part is quite technical and 
can be skipped by the non-familiar reader. Section [3] will present the model of the Solea solea behaviour 
as a particular EPDTA that can be quite intuitively understood even without knowing all the technical 
details. The idea of clock variables is central in the framework of timed automata and it is imported in 
our meta-model. A clock is a variable that takes values from the set N. Clocks measure time as it elapses, 
all clocks of a given system advance at the same pace and clock variables are ranged over by x, . . . 
We useX,X', ... to denote finite sets of clocks. A clock valuation over X is a function assigning a natural 
number to every clock. The set of valuations of X, denoted by Vx, is the set of total functions from X to 

N. Clock valuations are ranged over by V, v', Given V G Vx and « G N, we use V + nio denote the 

valuation that maps each clock x G X into v(x) + n. 

Clock variables, like other variables, can be assigned during the evolution of the system when certain 
actions are performed. The assignment consists in instantaneously set the value of a variable to a new 
value. Clock variables are always assigned to 0, i.e. they are reset. Immediately after this operation a 
clock restarts to measure time at the same pace as the others. The reset is useful to measure the time 
elapsed since the last action/event that reset the clock. Given a set X of clocks, a reset 7 is a subset of X. 
The set of all resets of clocks in X is denoted by Yx and reset sets are ranged over by 7, / , . . . Given a 
valuation V G Vx and a reset 7, we let v\7be the valuation that assign the value to every clock in 7 and 
assign v(x) to every clock x G X\7. 

We need also to consider a finite set B of boolean variables, ranged over hy b,b' , . . ., & finite set / of 
integer variables, ranged over by v, v', . • •> together with a range assignment function range : / ^ Z x Z 
such that if range (v) = (zi ,22) then za < Z2- Finally, we need a finite set F of totally specified functions, 
ranged over by /,/',.. ., that we use as tables in which constants values are collected and where then 
they can be retrieved by applying each function to values in its domain (essentially they are tables of 
probability values or array of constants). Such tables can contain rational numbers. If they are involved 
in integer operations they are rounded to the closest integer. 

The grammars introduced in the following define the syntax of a first-order language in which very 
usual functions and relations are present. The language can express boolean and arithmetic expres- 
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sions. Moreover, we define a syntax for expressing assignments to variable of corresponding type, clock 
constraints and guards. Bexp ::= tt \jf \ b\ Bexp A Bexp |~ Bexp \ Aexp = Aexp \ Aexp <= Aexp \ 
Aexp < Aexp \ (Bexp), Aexp ::= z\v\ f {Aexp, Aexp,.. Aexp + Aexp \ Aexp* Aexp \ Aexp— Aexp \ 
Aexp/Aexi^\ Aexp%Aex]^\ (Aexp) where z G Z. Assignments are of the form Assign ::= b <— Bexp \ 
V ^ Aexp I Assign, Assign. Boolean expressions are ranged over by j8,j8', . ■ •> arithmetic expressions are 
ranged over by a, a', . . . and assignments are ranged over by T] , 77', . . .. The timed behaviour of the system 
is expressed using constraints on the actual values of the clocks. Given a set X of clocks, the set *Px of 
clock constraints over X is defined by the following grammar: y '■'■= true \ false \ x#c\x — y#c\\i/^W 
where x,y G c G N, and # G {<,>,<,>,=}. Finally, guards, ranged over by g,g' , . . . are defined 
as Guard ::= Y \ Bexp \ Guard f\ Guard . As usual, we use the name of the syntactic category to de- 
note the set of the generated objects. Thus, for instance. Guard represents the set of all strings that are 
well-formed guards. 

Given sets B, I, we define an interpretation i as a function assigning a value to every variable in B and 
I. By means of an interpretation i we can evaluate a boolean expression j8 or an arithmetic expression a 
in the standard sense; we denote with Si (j3) the boolean value of j3 and with Si {a) the integer value of 
a both using the interpretation i. Moreover, we can define a satisfaction relation |= such that v ^ if 
the values of the clocks in v satisfy the constraint in the natural interpretation. Finally, the satisfaction 
relation can be extended naturally on guards: i , v |= ^. 

An assignment T] is evaluated as a change in the interpretation i. We denote with £/{i,ri) a new 
interpretation i' in which the variables that aie assigned in rj are al|^ changed with the corresponding 
values, evaluated from i in the above sense. 

Given a set H let us denote by the set of finite probability distributions over H i.e. IJ.{H) 

contains functions p: H ^ [0,1] such that Y,heH p{h) = 1 and the set {h ^ H \ p{h) > 0} is finite. A 
probability distribution p can be represented as follows: p = [h\^ p\, . . . ,hn^ Pn] where the his are 
exactly all elements of H that have p{hi) = pi > 0. 

Definition 2.1 (EPDTA). An extended probabilistic discrete timed automaton T is a tuple 
{Q,'L,B,I,X,E,,\],qo,lo,Inv), where: Q is a finite set of locations, Z is a finite alphabet of actions, B 
is a finite set of boolean variables, I is a finite set of finite-range integer variables, X is a finite set of 
clocks, E is a finite set of edges, U is a finite set of urgent edges, qo is the initial location, Iq is the initial 
interpretation of the variables of BUI, MAX_TIME is the maximum time of evolution andlnv is afunction 
assigning to every q £ Q an invariant, i.e. a clock constraint y such that for each clock valuation V G Vx 
and for each n G N^*', V + n\=\l/^v\=Y- Constraints having this property are called past-closed. 

Each edge e GEUV is a tuple in Q x Guard x jU(£ x Assign x p{X) x Q). If e = {q,g,prob) is an 
edge, q is the source, g is the guard and prob is the distribution. If prob{{a,r],Y,q')) > then there is 
a possibility for the automaton to reach the target location q' performing the action a, the assignment r\ 
and the reset 7. 

Figure[T]shows an EPDTA with three locations 10, 11, 12. The set of clocks is {x}, the alphabet is {a}, 
10 is the initial state, and the invariant of state ZO is x < 2. There is an edge starting from location 10 with 
a guard that is the conjunction of the clock constraint x> I and the boolean expression ~ b, where b ^B. 
At the edge it is associated a distribution [{ll,a,e, {x}) 1— )• 0.7, {I2,e,b ^ tt,{}) 1— )• 0.3], where e is the 

'According to what said above, this can be considered a constant. Of course the arguments of the function must be of the 
right number and of the right type. 
^Integer division. 
^Rest of integer division. 

'^Note that we suppose that 77 does not contain more than one assignment for each variable. 
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age <- age + 1 
length <- fVB(age, t) 
Fc <- ff Mc <- ff 

Rc <- ff 
ix 




PrR(t - lastB) 



Figure 2: EPDTA representing the behaviour of the sole when it is in class /. The other classes are equal. 

empty string. From 11 there is an edge in which the probability distribution is trivial. This transition is 
equivalent to a "classical" one. 

The semantics of an EPDTA is a Markov decision process. A Markov decision process (MDP) is a 
pair (S, Steps) where § is a set of states and Steps is a function giving for each state s a set of probabihty 
distributions. Each p G Steps(s) is a discrete probability distribution in /^(S), saying the probability of 
each state of being the next state s' in the process. A given MDP evolves as follows: at each step it is 
in a state s. Firstly it performs a non-deterministic choice to decide which distribution p G Steps (5) it 
will apply. Then it performs a probabilistic choice to go to a new state s' according to the chosen p. The 
process then cycles again. 

The semantics of T = (2,r,B,/,X,E,U,^o>l05^'"v) is a MDP (5,Steps) where the set of states S is 
the set of all the tuples of the form {q, v,l) where q G {rfo/?]|^ V G Vxu{t} is a valuation of the set of 
clocks X augmented with a fresh clock t that is never resej^ and i is an interpretation of the variables in 
BUI. Note that if we fix a MAX.TIME as the maximum time step for the system evolution then the set of 
states if finite as Q is finite and all variables, clock included, have a finite range. 

For every state s = {q,v,l) the set of distributions Steps(5') is determined by the following rules: 

Stop if v(f) = MAX _T I ME then [{stop,v,l) 1] G Steps(5) 

Time if v + 1 |= Inv{q) and v{t) + 1 < MAX_TIME and {M{q' ,g' ,prob') G \]{q' = q^i,v^ g')) then 
[(^,V + 1,1) 1-^ 1] G Steps(i') 

Urgent if v(f) / MAX_TIME and {q,g,prob) G U and i, V then [(^j, v\7i,i2^'(i,T]i)) pi,..., 

(^«,v\7„,J2/(l,T7„)) p„] G Steps(5) where pwb = [{ai,rii,Yi,q[) h-^ pi, . . . , (a„, T]„, 7„,^;) h-^ 

Pn] 

^The fresh location stop is added to terminate the activity of the automaton when the maximum time is reached. 
^This clock is needed to count the global elapsed time. 
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Non-Urgent if v{t) ^ MAX.TIME and {q,g,prob) G E and i,v\=g and {q' , g' , prob') G l]{q' = q^ 
i,v ^g')) then [{q\,v\yi,.<2f{i,r]i)) ^ pi,...,{q'„,v\Yn,£^ii,rin)) ^ Pn] G Steps(5) where 
prob= [{aurii,Yi,q[) ^ pi,. . . ,{an,ri,„Yn,q'n) ^ Pn] 

When rule Stop is applicable then no other rule is applicable. The process goes unconditionally to 
the stop location in which it stops. Rule Time lets one time unit to elapse, provided that the invariant of 
the current location will be satisfied at the reached state, that the maximum time was not reached and that 
no urgent transitions (i.e. that do not permit time passing) are enabled. Rule Urgent inserts in Steps(5') 
all possible distributions that derive from urgent transitions. Note that in case of more than one urgent 
transitions enabled all are inserted in Steps (5') and thus a non-deterministic choice is done among them 
by the MDR The resulting distributions are essentially the same of the original automaton, but here all 
the operations are performed on the clocks and on the variables to calculate the resulting state. The last 
rule Non-Urgent is applicable only if there are not urgent transitions enabled. The effect is the same of 
the urgent case and among different enabled non-urgent transitions a non-deterministic choice is done by 
the MDR 

Proposition 2.2. Given an EPDTA T and a natural number MAX_TIME it is possible to construct a 
Markov decision process IT in the syntax readable by the model checker PRISM such that IT and the 
semantics of T are the same Markov decision process. 

We plan to provide an automatic tool for this translation inside our simulator FIShR^Ss (see Sec- 
tion |4.3| ). This is very important because having the PRISM equivalent model improves the tests that the 
biologists can do against the probabilities put in the model itself. This is because quantitative questions 
can be asked to the model checker to test hypothesis made about the model or to validate it with available 
real data. A very powerful and useful logic language. Probabilistic CTL (PCTL) 1120112111 . is suitable for 
expressing such questions. 

3 Sole Characteristics and Behaviour: an EPDTA Model 

The body of the common sole (Solea solea) is egg-shaped and flat |[I]|T6l[l2l. The maximum body height 
is equal to 1/3 of the total length. The eyes are on the right side, the upper one slightly anterior to the 
lower. Both pectoral fins are well developed, the left one being somewhat smaller than the right one. 
The dorsal fin begins anterior to the eyes, by the mouth. The last rods of the dorsal and the anal fins are 
connected to the caudal fin, which is round. The colour on the eyed side of the body is greyish-brown to 
reddish-brown, with large and diffused dark spots. The pectoral fin has a blackfish spot at its distal half. 
The posterior margin of the caudal fin is generally dark. This common sole species lives in the eastern 
Atlantic, from Scandinavia to Senegal and in the entire Mediterranean. It is rare in the Black Sea. 

Here we present an individual model of a sole living in the North Adriatic sea as an EPDTA. This 
model is quite adaptable for other species of fish or soles of different environments by varying the differ- 
ent characteristic functions and probability tables that are embedded in the model itself. The quantitative 
information (lengths, probabilities, offspring estimation) was elaborated in collaboration with the Insti- 
tute of Marine Sciences of Ancona, Italy and members of their project SoleMon |[T5llT6l . As usually for 
this kind of fish, sole are categorized into so called classes which represents soles of similar age/length 
and thus with similar behaviour and subject to similar natural mortality or fishing. In the Adriatic sea, 
according to the project SoleMon, sole of age class O-i- aggregates inshore along the Italian coast, mostly 
in the area close to the Po river mouth; age class l-i- gradually migrates off-shore and adults concentrate 
in the deepest waters located at South West from Istria peninsula. Growth analyses on this species have 
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been made using otoliths, scales and tagging experiments. An otolith is a structure in the saccule or utri- 
cle of the inner ear, specifically in the vestibular labyrinth f24'| , whose section presents several concentric 
rings, very much like those of the tree trunks. By measuring the thickness of individual rings, it has been 
assumed (at least in some species) to estimate fish growth because fish growth is directly proportional to 
otolith growth. However, some studies disprove a direct link between body growth and otolith growth. 

A great variability in the growth rate was noted: some specimens had grown 2 cm in one month, 
while others, of the same age group, needed a whole year. The von Bertalanffy growth function ||9l 
(VBGF) introduced by von Bertalanffy in 1938 predicts the length of a fish as a function of its age: 



fvB{age) =U 



The length ifvB{age)) obtained is expressed in centimetres while age and to are in months; the different 
parameters that occur in the function are partly constants and partly calculated for our specific sole case 
study. Loo is not the maximum length of the animal but the asymptote for the model of average length-at- 
age , K is the so-called Brody growth rate coefficient which, if varied, allow to manipulate the growing 
function in order to represent periods of low food or abundance of food (so the soles grow less or more 
having the same age) and fo is the time or age when the average size is zero. There parameters of VBGF 
for the Adriatic sole have been calculated using various methods. Within the framework of the SoleMon 
project, growth parameters of sole were estimated through the length-frequency distributions obtained 
from surveys. The results are Loo = 39, 6 cm, K = 0.44 and to = —0.46. With this correspondence we 
can calculate the length of a sole of age age and consequently put it in one of the length classes. In the 
following table the ranges of the classes are shown: 



Class 


Minimum length (cm) 


Maximum length (cm) 








18.3 


1 


18.4 


25.8 


2 


25.9 


30.7 


3 


30.8 


33.9 


4+ 


34 


39.6 



Considering the relevance of K for the purpose of the growth function we defined in general a func- 
tion fvB{age,t) where the parameter t is an absolute month such that different periods could have a 
different K. The absolute month t = can be linked to a particular month of a particular year: in this 
way known past periods of low food or other environmental events can be represented in the growing 



function of the model. In the simulation of Section 4.4 we used always the same K along time. Knowing 
the length, it is possible to estimate the weight using the length-weight relationship: 
Weight{l) = a-l^. The parameters have been estimated in SoleMon: a = 0.007, b = 3.0638. Using this 
relationship we can determinate, at every instant during our simulations, the biomass of the whole stock 
under simulation. 

Natural mortality (not including fishing) has been estimated through a mortality index M available 
from the SoleMon project. From this annual index a probability distribution has been derived: PrM(i,0 
is the probability that in a given month t a sole in class / dies for natural mortality. Fixing a specific 
month for f = the values of the function are cyclic of a period of 12 months. However, in a simulation 
of several years the index can be varied in different years and months with a very fine granularity. This 
permits to represent in a simulation catastrophic periods or particularly favourable ones. The mortality 
probabilities PrM(i,0 (on a month basis per class) used in the simulations whose results are shown in 



Section 4.4 are the following: 
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Class 


Jan. 


Feb. 


Mar. 


Apr. 


May 


Jun. 


Jul. 


Aug. 


Sep. 


Oct. 


Nov. 


Dec. 





0.083 


0.078 


0.073 


0.068 


0.063 


0.058 


0.058 


0.053 


0.048 


0.043 


0.038 


0.033 


1 


0.032 


0.031 


0.030 


0.023 


0.030 


0.028 


0.028 


0.028 


0.027 


0.026 


0.026 


0.025 


2 


0.024 


0.024 


0.023 


0.023 


0.023 


0.023 


0.023 


0.022 


0.022 


0.022 


0.021 


0.021 
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0.021 


0.021 


0.021 


0.021 


0.021 


0.021 


0.021 


0.021 


0.021 


0.021 


0.021 


0.021 


4+ 


0.020 


0.020 


0.020 


0.020 


0.019 


0.019 


0.019 


0.019 


0.019 


0.019 


0.018 


0.018 



Mortality for fishing is estimated by a fishing index F . With the same reasoning done for the natural 
mortality a probability has been derived: Vxf{i,t) is the probability that in a given month t a sole in class 
/ is fished. In this case the periods of no fishing can be represented in the model. Similarly to the previous 
case the probability table can be cyclic over years or can be personalised month per month. The fishing 
index can be F = 0, meaning that there is no fish, can be moderate (estimated F = 0.2) or can be so 
strong that a situation of overfishing may occur (typically F > 1). For instance, the fishing probabilities 
Vrp{i,t) (on a month basis per class) corresponding to a fishing index F = 0.2 are the following: 



Class 


Jan. 


Feb. 


Mar. 


Apr. 


May 


Jun. 


Jul. 


Aug. 


Sep. 


Oct. 


Nov. 


Dec. 





1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 
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1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 
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1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 
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1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


4+ 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 


1.65 



Breeding is another important aspect of the life of soles that has been embedded in the model. In this 
case two estimations are needed. The first one is the probability of being reproductive after m months 
since the last breed, that we denote Pr/j(m). For simplicity, this probability has been estimated as for 
m = 0, 1, . . . , 11 and as 1 for all m> 12. However, this can be changed and refined in future versions 
of the model. If a sole passes this check of fertility then there is the probability of breeding Pre(/,f). 
In this case, of course, soles in class have this probability equals to 0. Soles of higher classes have 
higher probability to breed, but only in the appropriated months, which are from November to March of 
every year. This probability is then spread along these months. The table used in our simulations is the 
following: 



Class 


Jan. 


Feb. 


Mar. 


Apr. 


May 


Jun. 


Jul. 


Aug. 


Sep. 


Oct. 


Nov. 
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0.1 
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Note that in the case of breeding there is a potential artificial situation in a simulation. It can happen 
that one sole of the higher classes do not breed at all along one year. It is unlikely, but in the simulation 
may happen. This is absolutely not possible in reality. We plan to adapt the model in the future in order 
to avoid this situation. Another weakness of the model (and obviously of the simulator) is the lack of 
information about the offspring of an individual. This is a forced missing because no real information is 
known about the number of eggs that are fecundated nor the number of eggs that hatch, becoming new 
individuals. This issue is managed from the tool for the purposes of the simulation and will be treated in 
details in Section 1431 

The resulting overall model is shown in Figure [2] Two clocks are used, x for counting the passage 
of one month and t, never reset, for measuring the absolute time since the beginning. Note that only 
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Demography of the classes over time (in months) 
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Figure 3: Demography over time without fishing {F = 0.0). 



the transitions from one generic location "class /" are shown. The whole model is simply the resulting 
EPDTA considering the locations for all the classes, while the locations "dead" and "fished" are the 
same for all the classes. Every class has its particular fishing, mortality and breeding probability (e.g. a 
smaller, thus younger sole, has less probability to die/be fished than a older one). The boolean variables 
Mc,Fc,Rc Sire, used to assure that every month the sole makes a mortality check (M^), a fishing check 
(Fc) and a reproduction/breeding check (Re)- Note that time can advance of one month only if all these 
checks are done. The urgencjj^is used for forcing the class change of the class as soon as the sole reaches 
the minimum length for that class. The labels dead J, fish J and breed J are used to communicate to the 
simulation environment that a particular sole (the one sending the signal) of class / died, was fished or 
breed. The meaning of the integer variables age, length is obvious and the variable lastB counts the 
months elapsed since the last breeding of this sole. The probability tables and the functions used in 
Figure have been all described above. 



4 Simulation 

As shown at the end of Section |2] the probability model can be automatically checked to discover inter- 
esting properties or to make consistency checks on the given probabilities. Another important way to use 
the same model is for simulating a sole in a population of them. The idea is not new, but it very naturally 
fits in the fish stock monitoring. If we want to predict what happens to a population after several years 
of fishing of a certain strength under normal or particular conditions, all we have to do is to instruct a 

^In the picture the urgent transitions are indicated by a httle "u" attached at the beginning of their arrows. 
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Demography of the classes over time (in months) 
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Figure 4: Demography over time with light fishing {F = 0.2). 

population of virtual soles that uses the given model as probabilistic behaviour and let them evolve over 
time. In every moment, our virtual environment can show to us all the statistics we want to know about 
the whole stock but also about every single sole. 

Naturally the hard thing to do is to precisely tune the model with the most possible available real 
data. This work has to be done before the simulations on the model can be considered to have a certain 
degree of reliability. 

In this section we show the simulator that we have developed for reaching this goal. It uses agent 
technology, as we discuss in the following. We are currently at the very initial phase of the tuning of 
probabilities and other values using real data. The implementation is quite stable and the adaptability to 
other species can be done quite easily. More information is available at Q. 

4.1 Multi-Agent Systems 

Several definitions have been given for the term "agent" during the last decades, the most suited of which 
is the one given from Russel: an agent is something that can retrieve information from the environment 
through its sensor and can perform actions with its actuator [27 1. Alternatively, Woldrige and Jenning 
lfT9ll30l define agent as hardware or software-based computer system that have the following properties: 
autonomy, reactivity, pro-activeness, and social ability. A Multi-Agent System (MAS) is a collection of 
autonomous agents that communicate, cooperate, share knowledge and solve their own problem. 

In a MAS, each agent can be either cooperative or selfish; in other words the single agent can share 
a common goal with the others (e.g. an ant colony), or they can pursue their own interests (as in the free 
market economy). MAS are usually exploited when the problem considered cannot be solve (efficiently) 
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Figure 5: Demography over time with overfishing {F = 1.2). 

by an individual agent or a monolithic system. They are used to model coordinated defence systems 
but also for disaster response models, social network modelling, transportation, logistics, graphics as 
well as in many other fields when the problem is non-linear or the interaction with flexible individual 
participants have to be represented or again when in-homogeneous space is relevant. Finally, MAS are 
widely used in networking and mobile technologies, to achieve automatic and dynamic load balancing, 
high scalability, and self-healing networks. 

In the context of a MAS, an agent needs to communicate its information to the others and after that 
it needs to coordinate its activities (which is important to prevent conflicts between the agent belonging 
to the MAS) and negotiate its interest to solve a problem without conflicts. This need of interaction 
and exchange of information between agents is the basic characteristic that differentiate MASs from 
traditional artificial intelligence which work only as a single agent. 



4.2 Hermes middleware 

Hermes lfT4l |3l is an agent-based middleware for designing and execution of activity-based applications 
in distributed environments. It provides an integrated environment where users can focus on the design 
of the particular activity of interest ignoring the topological structure of the distributed environment. 
Hermes consists of a 3-layer software architecture: the Agent layer, the BasicServices layer and the Core 
layer. 

An Hermes execution consists of a creation of a MAS in which the agents are of two kinds: user 
agents and service agents. The Agent Layer is the upper layer of the mobile platform that contains both 
kinds of agents. A service agent accesses to local place resources such as data and tools (which, for 
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security reason, are not directly accessible) while a user agent executes complex tasks and implement 
part of the logic of the application. Hermes is based on the concept of place: a place is a well defined 
node of a network where service agents are located. When a service agent is created on a place and 
bound to it, there is no way for it to migrate to another place of the network. User agents can instead be 
copied to another place (weak mobility) and their execution can continue on the migration place. 

4.3 FIShPASs: Fishing Stock Probabilistic Agent-based Simulator 

FIShPASs ill is a simulator based on Hermes. It exploits the agent paradigm to simulate, as a MAS, the 
evolution of a population of fish of a certain species. At the moment a model for a sole (Solea solea) 
population living in the North Adriatic sea is available, but the simulator can be easily adapted for other 
species or soles of different environments 

The basic elements of the simulator are: the SoleaAgent, the SquareKilometreSea, the Registry 
and the Randomizer. The SquareKilometreSea is exactly the Hermes place where the simulation is 
started. This first version of the simulator is quite simple since the sea is not spatially simulated; it is 
more like a simple container for the population of soles (exactly what is an Hermes place for its agents). 
In the near future we point to improve the overall model with support to space and thus displacement of 
sole in the simulated sea, water currents, temperature and so on. 

The SoleaAgent is a user agent and represents a single sole in the simulated sea. While the spa- 
tial model is not so accurate, the sole behavioural model is quite complex. Indeed the SoleaAgent 
implements the EPDTA presented in Section [3] and thus can be fished, can die for natural mortality, 
can reproduce with the given probabilities, and naturally grows as time passes. Note that also the non- 
deterministic choices that are presented to the agent (as its behaviour is essentially a Markov decision 
process) are resolved probabilistically with a uniform distribution on all the enabled non-deterministic 
choices. As briefly discussed in Section [3j since the simulation is managed on a month basis, we can 
arrange theoretical probability so that certain behaviour cannot occur or can occur rarely in certain peri- 
ods of the year. For instance, we can suppose that the fishing period goes (hypothetically) from October 
to February while in the other months fishing is prohibited and fix the fishing probability to for the 
prohibited period. Since the probability values are given also on the basis of the length of the sole, the 
model can be easily adapted to different scenarios to simulate, for example, overfishing of some classes 
or sudden reduction of the population of some other classes because of a disease and so on. 

The third element of the simulation is the Randomizer. It is a service agent which generates random 
numbers in [0, 1] for the sole. Every time a sole checks the probability of doing something (death, 
reproduction, etc.) it requests a new number to the Randomizer. The service returns a new generated 
value that is contrasted with the probability of the individual to decide whether the action occurs or not. 

The last main element of the simulator is the Registry. Like the Randomizer it is a service agent 
and it simply keeps track of the sole available in the simulated sea. Its main purposes are the consistency 
control over the simulation and the generation of statistics about the simulated months (in particular, 
population per different class, weight of the biomass, number of death/fished soles in the last month). 
At the beginning of the simulation the Registry reads the input data and computes the number of 
individuals of the initial population then waits for them to communicate their status to the registry itself. 

SoleAgents are programmed to communicate their status to the Registry every month in any 
case (even in case of death), which means that the Registry can always know if all the soles have 
communicated with it during the current interaction. Moreover, it always knows the exactly demography 
of the population. This communication acts also as a synchronization that ensure time consistency on the 
population. It is the Registry that manages time increments and that enables the sole to execute their 
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internal behaviour (setting the SoleaAgent variable corresponding to the local clock "x" of the model of 
Figure|2]to 1). Before each increment the Registry waits for a communication from all the population 
of soles and then it increments the time. In this way the Registry ensure that at each simulation step 
(i.e. each month) no sole is out of simulation time range (behind or beyond the current time). 

Finally, the Registry generates new soles if some of the existing ones reproduced during the last 
month. Having the generation in the Registry is a strategic choice to be sure that the new born sole 
will be correctly set in the current time frame. In reality, a female sole produces, depending on its class, 
between 150000 and 250000 eggs and spreads them in the water. The number of them that will grow 
at least until class 1 is very low. Currently there is not a direct known relation between the number of 
females that breed in a month or in a season and the number of surviving and developing eggs in the 
following months/season. Thus, the Registry creates every month a number of newborn soles in class 
that corresponds to the number observed in reality (data from SoleMon project). One challenge for the 
future will be trying to find a relation between the signals of breed (breed/) given by the SoleAgents to 
the Registry in a certain period and the number of surviving newborn to introduce after some month(s). 

Given the real data about individuals in the different classes from 2005 to 2008, we taken the first 
column, which represents the newborns (males + females) of every year, halved the values (since we 
consider only female soles) and distributed the newborns so obtained along the year, according to the 
previous fertility table. 



population km'^ 





1 
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4+ 


2005 


169 


82 


36 


12 


4 


2006 


92 


179 


43 


10 
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2007 


205 


138 


72 
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1 


2008 


117 


123 


61 


10 


6 



In such a way we obtain the birth rate table below. It represents the amount of newborns that are 
automatically generated from the simulator every simulated month. Since the table covers only 4 years 
it is used cyclically in the subsequent years, thus the fifth year the generated newborns come from the 
first row of the table and so on. 
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Summing up, the FIShPASs simulation steps are the following: 

1. the sea (place) is launched and the service agents (Randomizer and Registry) are generated on 
it. The Registry calculates the initial population 

2. the sole population is generated from the place basing on the SoleMon project data 

3. the soles register to the Registry 

4. once all population has signed to the Registry, it generates statistics and starts their behavioural 
simulation by sending them a message to update their internal clock x 

5. the soles execute all their operations for the current month, reset the clock x and send an ack to the 
Registry 
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Total biomasses over time (kilograms/months) 
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Figure 6: Biomass over time witliout fisliing {F = 0.0.) 

6. once all the acks are received by the Registry, it generates statistics for the elapsed month, creates 
newborn soles and then sends a new message to increment the clock x 

The last two points are repeated until the desired period of time has passed. 
4.4 Results 

We set up a series of simulations to test our model. Every simulation has a timestep one month long and 
lasts for 72 months (6 years) considering always a virtual square kilometre of sea. The three charts of 
Figures [3} |4] and [5] show the trend of population that can occur, along the simulation, varying the fishing 
probability (i.e. the probability for a sole to be fished) while mortality and reproduction probabilities 
remain the same. The charts concentrate on three very different scenarios. The first one considers a case 
of no fishing (fishing index F=0.0). In this case the population remains stable, spreading on the different 
classes. It is particularly notable an increment of soles in the third class as well as a gradual and steady 
increment of individual in the fourth class. 

The second case considers a light fishing activity (F=0.2). The scenario rapidly changes with classes 
third and fourth that grow slower than in the previous chart. Moreover all of them has difficulty to get 
over 20 individuals (whereas in the previous plot all of the bigger classes where around 40 individuals). 
The trend is that of a decreased population of soles composed of less mature individuals and some small 
ones (see the biomasses charts for more details). 

The third and last chart about the population shows another possible scenario. In this case we suppose 
that the sea undergoes an overfishing activity (F= 1.2). This situation has obviously an extreme impact 
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Figure 7: Biomass over time with overfishing (F = 1.2). 
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Figure 8: Biomass over time with overfishing (F = 1.2). 
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on the population. As can be seen, after two years {IJi-lA on the x-axis), the population is simply gone 
with only the class of individuals available (due to their automatic generation, as explained above). 

The three charts of Figures[6}|7]and[8]represent the biomass trend, i.e. the total amount (in kilograms) 
of soles for the different classes in the three scenarios described above. In the case of no fishing (F=:0.0) 
the sole population tends to spread over all the classes and the biomass grows accordingly. The biomass 
increment is constant and at the end of the 6 years the total biomass is around 28 kg (against 8 kg at the 
beginning of the simulation). 

When we introduce light fishing (F=0.2) the biomass tendency is similar to the previous scenario but 
the values are totally different. In particular the population is impoverished in the higher classes (see 
previous plots) and the overall biomass grows slower at the beginning with a more marked decreasing 
tendency during the last year (months 60 - 73) when soles grow, reaching class fourth and thus are easier 
to be fished. At the end of the simulation the total biomass is around 13 kg which means less than an half 
of the soles biomass without fishing. 

With the introduction of overfishing (F=1.2) the scenario changes drastically. The fishing activities 
has a great impact on the population biomass that is halved at the beginning of the second year (13-15 
on the X-axis) and then runs fast under 1 kg in the middle of the forth year (41-42 on the x-axis). All the 
bigger soles, more subject to fishing, have been caught or are dead and only the smaller ones (i.e. with 
a small biomass) remain (also because they are auto-generated). Again, as seen in the corresponding 
classes chart, the population is decimated. 

More details and charts about these simulations can be found on the simulator website El along with 
contacts to request a copy of the current version of the tool. 

5 Conclusions and Future Work 

We have defined an individual-based model of the behaviour of a common sole (Solea soled) living in 
North Adriatic sea. The model has been specified as an Extended Probabilistic Discrete Timed Au- 
tomaton (EPDTA), a formalism that is a variant of probabilistic timed automata. We have defined the 
semantics of an EPDTA as a Markov decision process and we have observed that an EPDTA can be 
translated to a syntax acceptable by the model-checker PRISM. The estimation of the probabilities and 
of the characteristic function of the species has been done by using the real data of the SoleMon project. 
The individual probabilistic behaviour then has been embedded into an agent of a MAS. The MAS sim- 
ulates the population of soles over time and can provide information on the evolution of the stock by 
monthly statistics of the individual states. We have presented the simulator FIShPASs (Fishing Stock 
Probabilistic Agent-based Simulator) that implements the presented model and is easily adaptable for 
other species. 

There are a lot of interesting things to do as future work. First, we want to tune the model, working 
in team with specialized biologists, in order to increase the confidence on its predictions. The translation 
of the model into a PRISM acceptable syntax can be made available inside the simulation environment. 
Having the PRISM equivalent model can highly improve the tests that the biologists can do against the 
probabilities put in the various tables embedded in the model. This is because quantitative questions 
can be asked to the model checker to test hypothesis made about the model itself or to validate it with 
available real data. In the MAS part a huge number of improvements are possible. For instance, soles can 
be given a geometrical space to occupy and can move in the simulated square kilometre. They can also 
emigrate and immigrate from/to the simulated space. A 3D environment, i.e. a cube kilometre, instead of 
a 2D one could be more appropriate because other species could be simulated simultaneously and made 
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interact with the soles (towards a more predator-prey approach). Moreover, a physical conformation 
of the territory can be added to the model possibly influencing the interactions (of different kind, to 
be introduced in the model too) between the individuals (the formation of an isolated population, the 
impossibility to meet, etc.). Finally, the effects of the passage of a particular fishing device can be 
modelled; for this we know there are available data for tuning/ validation. 
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